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A  new  paradigm,  Random  Sample  Consensus  ' 
(ransac),  for  fitting  a  model  to  experimental  data  is 
introduced,  ransac  is  capable  of  interpreting/ 
smoothing  data  containing  a  significant  percentage  of 
gross  errors,  and  is  thus  ideally  suited  for  applications 
in  automated  image  analysis  where  interpretation  is 
based  on  the  data  provided  by  error-prone  feature 
detectors.  A  major  portion  of  this  paper  describes  the 
application  of  ransac  to  the  Location  Determination 
Problem  (LDP):  Given  an  image  depicting  a  set  of 
landmarks  with  known  locations,  determine  that  point 
in  space  from  which  the  image  was  obtained.  In 
response  to  a  ransac  requirement,  new  results  are 
derived  on  the  minimum  number  of  landmarks  needed 
to  obtain  a  solution,  and  algorithms  are  presented  for 
computing  these  minimum-landmark  solutions  in  closed 
form.  These  results  provide  the  basis  for  an  automatic 
system  that  can  solve  the  LDP  under  difficult  viewing 
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and  analysis  conditions.  Implementation  details  and 
computational  examples  are  also  presented. 
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determination,  automated  cartography. 
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I.  Introduction 

We  introduce  a  new  paradigm,  Random  Sample 
Consensus  (ransac),  for  fitting  a  model  to  experimental 
data;  and  illustrate  its  use  in  scene  analysis  and  auto¬ 
mated  cartography.  The  application  discussed,  the  loca¬ 
tion  determination  problem  (LDP),  is  treated  at  a  level 
beyond  that  of  a  mere  example  of  the  use  of  the  ransac 
paradigm;  new  basic  findings  concerning  the  conditions 
under  which  the  LDP  can  be  solved  are  presented  and 
a  comprehensive  approach  to  the  solution  of  this  problem 
that  we  anticipate  will  have  near-term  practical  appli¬ 
cations  is  described. 

To  a  large  extent,  scene  analysis  (and,  in  fact,  science 
in  general)  is  concerned  with  the  interpretation  of  sensed 
data  in  terms  of  a  set  of  predefined  models.  Conceptually, 
interpretation  involves  two  distinct  activities:  First,  there 
is  the  problem  of  finding  the  best  match  between  the 
data  and  one  of  the  available  models  (the  classification 
problem);  Second,  there  is  the  problem  of  computing  the 
best  values  for  the  free  parameters  of  the  selected  model 
(the  parameter  estimation  problem).  In  practice,  these 
two  problems  are  not  independent — a  solution  to  the 
parameter  estimation  problem  is  often  required  to  solve 
the  classification  problem. 

Classical  techniques  for  parameter  estimation,  such 
as  least  squares,  optimize  (according  to  a  specified  ob¬ 
jective  function)  the  fit  of  a  functional  description 
(model)  to  all  of  the  presented  data.  These  techniques 
have  no  internal  mechanisms  for  detecting  and  rejecting 
gross  errors.  They  are  averaging  techniques  that  rely  on 
the  assumption  (the  smoothing  assumption)  that  the 
maximum  expected  deviation  of  any  datum  from  the 
assumed  model  is  a  direct  function  of  the  size  of  the  data 
set,  and  thus  regardless  of  the  size  of  the  data  set,  there 
will  always  be  enough  good  values  to  smooth  out  any 
gross  deviations. 

In  many  practical  parameter  estimation  problems  the 
smoothing  assumption  does  not  hold;  i.e.,  the  data  con¬ 
tain  uncompensated  gross  errors.  To  deal  with  this  situ¬ 
ation,  several  heuristics  have  been  proposed.  The  tech¬ 
nique  usually  employed  is  some  variation  of  first  using 
all  the  data  to  derive  the  model  parameters,  then  locating 
the  datum  that  is  farthest  from  agreement  with  the 
instantiated  model,  assuming  that  it  is  a  gross  error, 
deleting  it,  and  iterating  this  process  until  either  the 
maximum  deviation  is  less  then  some  preset  threshold  or 
until  there  is  no  longer  sufficient  data  to  proceed. 

It  can  easily  be  shown  that  a  single  gross  error 
(“poisoned  point”),  mixed  in  with  a  set  of  good  data,  can 
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Fig.  1.  Failure  of  Least  Squares  (and  the  “Throwing  Out  The  Worst  Residual”  Heuristic),  to  Deal  with  an  Erroneous  Data  Point. 

PROBLEM:  Given  the  set  of  seven  (x,y)  pairs  shown  in  the  plot,  find  a  best  fit  line,  assuming  that  no  valid  datum 
deviates  from  this  line  by  more  than  0.8  units. 
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COMMENT :  Six  of  the  seven  points  are  valid  data  and  can  be  fit  by  the  solid  line.  Using  Least  Squares  (and  the 
“throwing  out  the  worst  residual’’  heuristic),  we  terminate  after  four  iterations  with  four  remaining 
points,  including  the  gross  error  at  (10,2)  fit  by  the  dashed  line. 


SUCCESSIVE  LEAST  SQUARES  APPROXIMATIONS 

ITERATION 

DATA  SET 

FITTING  LINE 

1 

1,2,3,  4,  5,  6,  7 

1.48  +  ,16x 

2 

1,2,  3,  4,  5,7 

1.25  +  ,13x 

3 

1,2,3,  4,  7 

0.96  +  ,14x 

4 

2,3,  4,7 

1.51  +  .06x 

COMPUTATION  OF  RESIDUALS 

POINT 

ITERATION  1 
RESIDUALS 

ITERATION  2 
RESIDUALS 

ITERATION  3 
RESIDUALS 

ITERATION  4 
RESIDUALS 

1 

-1.48 

-1.25 

-.96* 

— 

2 

-0.64 

-0.38 

-.10 

-.57 

3 

-0.20 

0.49 

.76 

.37 

4 

0.05 

0.36 

.63 

.31 

5 

1.05 

1.36* 

— 

— 

6 

1.89* 

— 

— 

— 

7 

-1.06 

-0.57 

-.33 

-.11 
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cause  the  above  heuristic  to  fail  (for  example,  see  Figure 
1).  It  is  our  contention  that  averaging  is  not  an  appro¬ 
priate  technique  to  apply  to  an  unverified  data  set. 

In  the  following  section  we  introduce  the  ransac 
paradigm,  which  is  capable  of  smoothing  data  that  con¬ 
tain  a  significant  percentage  of  gross  errors.  This  para¬ 
digm  is  particularly  applicable  to  scene  analysis  because 
local  feature  detectors,  which  often  make  mistakes,  are 
the  source  of  the  data  provided  to  the  interpretation 
algorithms.  Local  feature  detectors  make  two  types  of 
errors — classification  errors  and  measurement  errors. 
Classification  errors  occur  when  a  feature  detector  in¬ 
correctly  identifies  a  portion  of  an  image  as  an  occur¬ 
rence  of  a  feature.  Measurement  errors  occur  when  the 
feature  detector  correctly  identifies  the  feature,  but 
slightly  miscalculates  one  of  its  parameters  (e.g.,  its  image 
location).  Measurement  errors  generally  follow  a  normal 
distribution,  and  therefore  the  smoothing  assumption  is 
applicable  to  them.  Classification  errors,  however,  are 
gross  errors,  having  a  significantly  larger  effect  than 
measurement  errors,  and  do  not  average  out. 

In  the  final  sections  of  this  paper  the  application  of 
ransac  to  the  location  determination  problem  is  dis¬ 
cussed: 

Given  a  set  of  “landmarks”  (“control  points”),  whose  locations  are 
known  in  some  coordinate  frame,  determine  the  location  (relative  to 
the  coordinate  frame  of  the  landmarks)  of  that  point  in  space  from 
which  an  image  of  the  landmarks  was  obtained. 

In  response  to  a  ransac  requirement,  some  new 
results  are  derived  on  the  minimum  number  of  land¬ 
marks  needed  to  obtain  a  solution,  and  then  algorithms 
are  presented  for  computing  these  minimum-landmark 
solutions  in  closed  form.  (Conventional  techniques  are 
iterative  and  require  a  good  initial  guess  to  assure  con¬ 
vergence.)  These  results  form  the  basis  for  an  automatic 
system  that  can  solve  the  LDP  under  severe  viewing  and 
analysis  conditions.  In  particular,  the  system  performs 
properly  even  if  a  significant  number  of  landmarks  are 
incorrectly  located  due  to  low  visibility,  terrain  changes, 
or  image  analysis  errors.  Implementation  details  and 
experimental  results  are  presented  to  complete  our  de¬ 
scription  of  the  LDP  application. 


II.  Random  Sample  Consensus 

The  ransac  procedure  is  opposite  to  that  of  conven¬ 
tional  smoothing  techniques:  Rather  than  using  as  much 
of  the  data  as  possible  to  obtain  an  initial  solution  and 
then  attempting  to  eliminate  the  invalid  data  points, 
ransac  uses  as  small  an  initial  data  set  as  feasible  and 
enlarges  this  set  with  consistent  data  when  possible.  For 
example,  given  the  task  of  fitting  an  arc  of  a  circle  to  a 
set  of  two-dimensional  points,  the  ransac  approach 
would  be  to  select  a  set  of  three  points  (since  three  points 
are  required  to  determine  a  circle),  compute  the  center 
and  radius  of  the  implied  circle,  and  count  the  number 
of  points  that  are  close  enough  to  that  circle  to  suggest 
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their  compatibility  with  it  (i.e.,  their  deviations  are  small 
enough  to  be  measurement  errors).  If  there  are  enough 
compatible  points,  ransac  would  employ  a  smoothing 
technique  such  as  least  squares,  to  compute  an  improved 
estimate  for  the  parameters  of  the  circle  now  that  a  set 
of  mutually  consistent  points  has  been  identified. 

The  ransac  paradigm  is  more  formally  stated  as 
follows: 

Given  a  model  that  requires  a  minimum  of  n  data  points  to  instantiate 
its  free  parameters,  and  a  set  of  data  points  P  such  that  the  number  of 
points  in  P  is  greater  than  n  [t(P)  2:  n],  randomly  select  a  subset  SI  of 
n  data  points  from  P  and  instantiate  the  model.  Use  the  instantiated 
model  A/1  to  determine  the  subset  SI*  of  points  in  P  that  are  within 
some  error  tolerance  of  A/1.  The  set  SI*  is  called  the  consensus  set  of 
SI. 

If  #  (SI*)  is  greater  than  some  threshold  t,  which  is  a  function  of  the 
estimate  of  the  number  of  gross  errors  in  P,  use  SI*  to  compute 
(possibly  using  least  squares)  a  new  model  A/1*. 

If  #  (SI*)  is  less  than  t,  randomly  select  a  new  subset  S2  and  repeat  the 
above  process.  If,  after  some  predetermined  number  of  trials,  no 
consensus  set  with  t  or  more  members  has  been  found,  either  solve  the 
model  with  the  largest  consensus  set  found,  or  terminate  in  failure. 

There  are  two  obvious  improvements  to  the  above 
algorithm:  First,  if  there  is  a  problem  related  rationale 
for  selecting  points  to  form  the  S’s,  use  a  deterministic 
selection  process  instead  of  a  random  one;  second,  once 
a  suitable  consensus  set  S*  has  been  found  and  a  model 
M*  instantiated,  add  any  new  points  from  P  that  are 
consistent  with  M*  to  S*  and  compute  a  new  model  on 
the  basis  of  this  larger  set. 

The  ransac  paradigm  contains  three  unspecified 
parameters:  (1)  the  error  tolerance  used  to  determine 
whether  or  not  a  point  is  compatible  with  a  model,  (2) 
the  number  of  subsets  to  try,  and  (3)  the  threshold  t, 
which  is  the  number  of  compatible  points  used  to  imply 
that  the  correct  model  has  been  found.  Methods  are 
discussed  for  computing  reasonable  values  for  these  pa¬ 
rameters  in  the  following  subsections. 

A.  Error  Tolerance  For  Establishing  Datum/Model 
Compatibility 

The  deviation  of  a  datum  from  a  model  is  a  function 
of  the  error  associated  with  the  datum  and  the  error 
associated  with  the  model  (which,  in  part,  is  a  function 
of  the  errors  associated  with  the  data  used  to  instantiate 
the  model).  If  the  model  is  a  simple  function  of  the  data 
points,  it  may  be  practical  to  establish  reasonable  bounds 
on  error  tolerance  analytically.  However,  this  straight¬ 
forward  approach  is  often  unworkable;  for  such  cases  it 
is  generally  possible  to  estimate  bounds  on  error  toler¬ 
ance  experimentally.  Sample  deviations  can  be  produced 
by  perturbing  the  data,  computing  the  model,  and  mea¬ 
suring  the  implied  errors.  The  error  tolerance  could  then 
be  set  at  one  or  two  standard  deviations  beyond  the 
measured  average  error. 

The  expected  deviation  of  a  datum  from  an  assumed 
model  is  generally  a  function  of  the  datum,  and  therefore 
the  error  tolerance  should  be  different  for  each  datum. 
However,  the  variation  in  error  tolerances  is  usually 
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relatively  small  compared  to  the  size  of  a  gross  error. 
Thus,  a  single  error  tolerance  for  all  data  is  often  suffi¬ 
cient. 

B.  The  Maximum  Number  of  Attempts  to  Find  a  Con¬ 
sensus  Set 

The  decision  to  stop  selecting  new  subsets  of  P  can 
be  based  upon  the  expected  number  of  trials  k  required 
to  select  a  subset  of  n  good  data  points.  Let  w  be  the 
probability  that  any  selected  data  point  is  within  the 
error  tolerance  of  the  model.  Then  we  have: 

E(k)  =  b  +  2*(1  -  b)*b  +  3*(1  -  b)2*b 

•■•  +  z*(l  -  by-^b  +  ... , 

E(k)  =  b*[  1  +  2 *a  +  3  V  . . .  +  /V-1  +...], 

where  E  (k)  is  the  expected  value  of  k,  b  =  wn,  and 
a  =  (1  -  b ). 

An  identity  for  the  sum  of  a  geometric  series  is 

a/(l  —  a)  —  a  +  u2  +  a3  •  •  •  +  a‘  +  •  •  •  . 

Differentiating  the  above  identity  with  respect  to  a , 
we  have: 

1/(1  -  a)2  =  1  +  2 *a  +  3 *a2  ■■•  +  . 

Thus, 

E(k)  =  \/b  =  w~n 

The  following  is  a  tabulation  of  some  values  of  E(k) 
for  corresponding  values  of  n  and  w: 


w 

n  =  1 

2 

3 

4 

5 

6 

0.9 

l.l 

1.2 

1.4 

1.5 

1.7 

1.9 

0.8 

1.3 

1.6 

2.0 

2.4 

3.0 

3.8 

0.7 

1.4 

2.0 

2.9 

4.2 

5.9 

8.5 

0.6 

1.7 

2.8 

4.6 

7.7 

13 

21 

0.5 

2.0 

4.0 

8.0 

16 

32 

64 

0.4 

2.5 

6.3 

16 

39 

98 

244 

0.3 

3.3 

11 

37 

123 

412 

— 

0.2 

5.0 

25 

125 

625 

— 

— 

In  general,  we  would  probably  want  to  exceed  E\k) 
trials  by  one  or  two  standard  deviations  before  we  give 
up.  Note  that  the  standard  deviation  of  k,  SD(k),  is  given 
by: 

SD(k)  =  sqrt  [ E(k 2)  -  E(k)2]. 

Then 

oo 

E(k 2)  =  2  (1 b*i2*a i"1), 

i=0 

00  00 

=  2  -  l)*ai_1]  +  2  ( hW_1 ), 

i= 0  i=0 

but  (using  the  geometric  series  identity  and  two  differ¬ 
entiations): 

oo 

2a/(l  -  af  =  2  (/*(/  -  l)*^1). 


Thus, 

E(k2)  =  (2  -  b)/(b2), 
and 

SD(k )  =  [sqrt  (1  -  OJ^l/w"). 

Note  that  generally  SD(k)  will  be  approximately 
equal  to  E(k);  thus,  for  example,  if  (w  =  0.5)  and  (n  =  4), 
then  E{k)  =  16  and  SD(k)  =  15.5.  This  means  that  one 
might  want  to  try  two  or  three  times  the  expected  number 
of  random  selections  implied  by  k  (as  tabulated  above) 
to  obtain  a  consensus  set  of  more  than  t  members. 

From  a  slightly  different  point  of  view,  if  we  want  to 
ensure  with  probability  z  that  at  least  one  of  our  random 
selections  is  an  error-free  set  of  n  data  points,  then  we 
must  expect  to  make  at  least  k  selections  (n  data  points 
per  selection),  where 

(1  -  b)k  =  (1  -  z), 

k  =  [log(l  -  z)]/[log(l  -  b)l 

For  example,  if  (w  =  0.5)  and  (n  =  4),  then  (b  = 
1/16).  To  obtain  a  90  percent  assurance  of  making  at 
least  one  error-free  selection, 

k  =  log(0.1)/log(15/16)  =  35.7. 

Note  that  if  w"  <K  1,  then  k  ~  log(l  —  z)E(k).  Thus  if 
z  =  0.90  and  /  <sc  1,  then  k  ~  2.3 E(k);  if  z  =  0.95  and 
wn  «  1,  then  k  ~  3.0 E(k). 

C.  A  Lower  Bound  On  the  Size  of  an  Acceptable  Con¬ 
sensus  Set 

The  threshold  t,  an  unspecified  parameter  in  the 
formal  statement  of  the  ransac  paradigm,  is  used  as  the 
basis  for  determining  that  an  n  subset  of  P  has  been 
found  that  implies  a  sufficiently  large  consensus  set  to 
permit  the  algorithm  to  terminate.  Thus,  t  must  be  chosen 
large  enough  to  satisfy  two  purposes:  that  the  correct 
model  has  been  found  for  the  data,  and  that  a  sufficient 
number  of  mutually  consistent  points  have  been  found 
to  satisfy  the  needs  of  the  final  smoothing  procedure 
(which  computes  improved  estimates  for  the  model  pa¬ 
rameters). 

To  ensure  against  the  possibility  of  the  final  consen¬ 
sus  set  being  compatible  with  an  incorrect  model,  and 
assuming  that  y  is  the  probability  that  any  given  data 
point  is  within  the  error  tolerance  of  an  incorrect  model, 
we  would  like  y‘~n  to  be  very  small.  While  there  is  no 
general  way  of  precisely  determining  y,  it  is  certainly 
reasonable  to  assume  that  it  is  less  than  w  (w  is  the 
a  priori  probability  that  a  given  data  point  is  within  the 
error  tolerance  of  the  correct  model).  Assuming  y  <  0.5, 
a  value  of  t  —  n  equal  to  5  will  provide  a  better  than  95 
percent  probability  that  compatibility  with  an  incorrect 
model  will  not  occur. 

To  satisfy  the  needs  of  the  final  smoothing  procedure, 
the  particular  procedure  to  be  employed  must  be  speci¬ 
fied.  If  least-squares  smoothing  is  to  be  used,  there  are 
many  situations  where  formal  methods  can  be  invoked 
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to  determine  the  number  of  points  required  to  produce 
a  desired  precision  [10]. 

D.  Example 

Let  us  apply  ransac  to  the  example  described  in 
Figure  1.  A  value  of  w  (the  probability  that  any  selected 
data  point  is  within  the  error  tolerance  of  the  model) 
equal  to  0.85  is  consistent  with  the  data,  and  a  tolerance 
(to  establish  datum/model  compatibility)  of  0.8  units 
was  supplied  as  part  of  the  problem  statement.  The 
RANSAC-supplied  model  will  be  accepted  without  exter¬ 
nal  smoothing  of  the  final  consensus  set;  thus,  we  would 
like  to  obtain  a  consensus  set  that  contains  all  seven  data 
points.  Since  one  of  these  points  is  a  gross  error,  it  is 
obvious  that  we  will  not  find  a  consensus  set  of  the 
desired  size,  and  so  we  will  terminate  with  the  largest  set 
we  are  able  to  find.  The  theory  presented  earlier  indicates 
that  if  we  take  two  data  points  at  a  time,  compute  the 
line  through  them  and  measure  the  deviations  of  the 
remaining  points  from  this  line,  we  should  expect  to  find 
a  suitable  consensus  set  within  two  or  three  trials;  how¬ 
ever,  because  of  the  limited  amount  of  data,  we  might  be 
willing  to  try  all  21 -combinations  to  find  the  largest 
consensus  set.  In  either  case,  we  easily  find  the  consensus 
set  containing  the  six  valid  data  points  and  the  line  that 
they  imply. 


III.  The  Location  Determination  Problem  (LDP) 

A  basic  problem  in  image  analysis  is  establishing  a 
correspondence  between  the  elements  of  two  represen¬ 
tations  of  a  given  scene.  One  variation  of  this  problem, 
especially  important  in  cartography,  is  determining  the 
location  in  space  from  which  an  image  or  photograph 
was  obtained  by  recognizing  a  set  of  landmarks  (control 
points)  appearing  in  the  image  (this  is  variously  called 
the  problem  of  determining  the  elements  of  exterior 
camera  orientation,  or  the  camera  calibration  problem, 
or  the  image-to-database  correspondence  problem).  It  is 
routinely  solved  using  a  least-squares  technique  [11,  8] 
with  a  human  operator  interactively  establishing  the 
association  between  image  points  and  the  three-dimen¬ 
sional  coordinates  of  the  corresponding  control  points. 
However,  in  a  fully  automated  system,  where  the  corre¬ 
spondences  must  be  based  on  the  decisions  of  marginally 
competent  feature  detectors,  least  squares  is  often  incap¬ 
able  of  dealing  with  the  gross  errors  that  may  result;  this 
consideration,  discussed  at  length  in  Sec.  II,  is  illustrated 
for  the  LDP  in  an  example  presented  in  Sec.  IV. 

In  this  section  a  new  solution  to  the  LDP  is  presented 
based  on  the  ransac  paradigm,  which  is  unique  in  its 
ability  to  tolerate  gross  errors  in  the  input  data.  We  will 
first  examine  the  conditions  under  which  a  solution  to 
the  LDP  is  possible  and  describe  new  results  concerning 
this  question;  we  then  present  a  complete  description  of 
the  RANSAC-based  algorithm,  and  finally,  describe  exper¬ 
imental  results  obtained  through  use  of  the  algorithm. 
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Fig.  2.  Geometry  of  the  Location  Determination  Problem. 
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The  LDP  is  formally  defined  as  follows: 

Given  a  set  of  m  control  points,  whose  3-dimensional  coordinates  are 
known  in  some  coordinate  frame,  and  given  an  image  in  which  some 
subset  of  the  m  control  points  is  visible,  determine  the  location  (relative 
to  the  coordinate  system  of  the  control  points)  from  which  the  image 
was  obtained. 

We  will  initially  assume  that  we  know  the  corre¬ 
spondences  between  n  image  points  and  control  points; 
later  we  consider  the  situation  in  which  some  of  these 
correspondences  are  invalid.  We  will  also  assume  that 
both  the  principal  point  in  the  image  plane  (where  the 
optical  axis  of  the  camera  pierces  the  image  plane)  and 
the  focal  length  (distance  from  the  center  of  perspective 
to  the  principal  point  in  the  image  plane)  of  the  imaging 
system  are  known;  thus  (see  Figure  2)  we  can  easily 
compute  the  angle  to  any  pair  of  control  points  from  the 
center  of  perspective  (CP).  Finally,  we  assume  that  the 
camera  resides  outside  and  above  a  convex  hull  enclosing 
the  control  points. 

We  will  later  demonstrate  (Appendix  A)  that  if  we 
can  compute  the  lengths  of  the  rays  from  the  CP  to  three 
of  the  control  points  then  we  can  directly  solve  for  the 
location  of  the  CP  (and  the  orientation  of  the  image 
plane  if  desired).  Thus,  an  equivalent  but  mathematically 
more  concise  statement  of  the  LDP  is 

Given  the  relative  spatial  locations  of  n  control  points,  and  given  the 
angle  to  every  pair  of  control  points  from  an  additional  point  called 
the  Center  of  Perspective  (CP),  find  the  lengths  of  the  line  segments 
(“legs”)  joining  the  CP  to  each  of  the  control  points.  We  call  this  the 
“perspective-n-point”  problem  (PnP). 

In  order  to  apply  the  ransac  paradigm,  we  wish  to 
determine  the  smallest  value  of  n  for  which  it  is  possible 
to  solve  the  PnP  problem. 
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Fig.  3.  Geometry  of  the  P2P  Problem. 


Fig.  4.  Geometry  of  the  P3P  Problem  (L  is  the  Center  of  Perspective). 
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A.  Solution  of  the  Perspective-n-Point  Problem 

The  PIP  problem  (n  =  1)  provides  no  constraining 
information,  and  thus  an  infinity  of  solutions  is  possible. 
The  P2P  problem  ( n  =  2),  illustrated  in  Figure  3,  also 
admits  an  infinity  of  solutions;  the  CP  can  reside  any¬ 
where  on  a  circle  of  diameter  Rab/sin(0a/>),  rotated  in 
space  about  the  chord  (line)  joining  the  two  control 
points  A  and  B. 

The  P3P  problem  ( n  =  3)  requires  that  we  determine 
the  lengths  of  the  three  legs  of  a  tetrahedron,  given  the 
base  dimensions  and  the  face  angles  of  the  opposing 
trihedral  angle  (see  Figure  4).  The  solution  to  this  prob¬ 
lem  is  implied  by  the  three  equations  [A*]: 

( Rabf  =  d 2  +  b2-  2*a*b*[cos(6ab)] 

( Racf  =  a1  +  c1  —  2*a*c*  [cos(ftzc)]  [A*] 

{Rbcf  =  b2  +  c2  —  2 *b*c*  [cos(#6c)] 

It  is  known  that  n  independent  polynomial  equations, 
in  n  unknowns,  can  have  no  more  solutions  than  the 
product  of  their  respective  degrees  [2],  Thus,  the  system 
A*  can  have  a  maximum  of  eight  solutions.  However, 
because  every  term  in  the  system  A*  is  either  a  constant 
or  of  second  degree,  for  every  real  positive  solution  there 
is  a  geometrically  isomorphic  negative  solution.  Thus, 
there  are  at  most  four  positive  solutions  to  A*,  and  in 
Figure  5  we  show  an  example  demonstrating  that  the 
upper  bound  of  four  solutions  is  attainable. 


In  Appendix  A  we  derive  an  explicit  algebraic  solu¬ 
tion  for  the  system  A*.  This  is  accomplished  by  reducing 
A  *  to  a  biquadratic  (quartic)  polynomial  in  one  unknown 
representing  the  ratio  of  two  legs  of  the  tetrahedron,  and 
then  directly  solving  this  equation  (we  also  present  a 
very  simple  iterative  method  for  obtaining  the  solutions 
from  the  given  problem  data). 

For  the  case  n  =  4,  when  all  four  control  points  lie  in 
a  common  plane  (not  containing  the  CP,  and  such  that 
no  more  than  two  of  the  control  points  lie  on  any  single 
line),  we  provide  a  technique  in  Appendix  B  that  will 
always  produce  a  unique  solution.  Surprisingly,  when  all 
four  control  points  do  not  lie  in  the  same  plane,  a  unique 
solution  cannot  always  be  assured;  for  example,  Figure 
6  shows  that  at  least  two  solutions  are  possible  for  the 
P4P  problem  with  the  control  points  in  “general  posi¬ 
tion.” 

To  solve  for  the  location  of  the  CP  in  the  case  of  four 
nonplanar  control  points,  we  can  use  the  algorithm 
presented  in  Appendix  A  on  two  distinct  subsets  of  the 
control  points  taken  three  at  a  time;  the  solution(s) 
common  to  both  subsets  locate  the  CP  to  within  the 
ambiguity  inherent  in  the  given  information. 

The  approach  used  to  construct  the  example  shown 
in  Figure  6  can  be  extended  to  any  number  of  additional 
points.  It  is  based  on  the  principle  depicted  in  Figure  3: 
If  the  CP  and  any  number  of  control  points  lie  on  the 
same  circle,  then  the  angle  between  any  pair  of  control 
points  and  the  CP  will  be  independent  of  the  location  on 
the  circle  of  the  CP  (and  hence  the  location  of  the  CP 
cannot  be  determined).  Thus,  we  are  able  to  construct 
the  example  shown  in  Figure  7,  in  which  five  control 
points  in  general  position  imply  two  solutions  to  the  P5P 
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Fig.  5.  An  Example  Showing  Four  Distinct  Solutions  to  a  P3P  Problem. 
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Consider  the  tetrahedron  in  Figure  5(a).  The  base 
ABC  is  an  equilateral  triangle  and  the  “legs”  (i.e.,  LA, 
LB,  and  LC )  are  all  equal.  Therefore,  the  three  face 
angles  at  L  (i.e.,  <ALB,  <ALC,  and  <BLC)  are  all 
equal.  By  the  law  of  cosines  we  have: 

Cos  (a)  =  5/8. 

This  tetrahedron  defines  one  solution  to  a  P3P  prob¬ 
lem.  A  second  solution  is  shown  in  Figure  5(b).  It  is 
obtained  from  the  first  by  rotating  L  about  BC.  It  is 
necessary  to  verify  that  the  length  of  L'A  can  be  1 , 
given  the  rigid  triangle  ABC  and  the  angle  alpha.  From 
the  law  of  cosines  we  have: 

(2*y/3f  =  42  +  (L'A)2  -  2 ’4* (L'A) *(5/8) 

which  reduces  to: 

(L'A  -  1)  *  (L'A  -  4)  =  0. 

Therefore,  L'A  can  be  either  1  or  4.  Figure  5(a) 
illustrates  the  L'A  =  4  case  and  Figure  5(b)  illustrates 
the  L'A  =  1  case. 

Notice  that  repositioning  the  base  triangle  so  that 
its  vertices  move  to  different  locations  on  the  legs  is 
equivalent  to  repositioning  L.  Figure  5(c)  shows  the 
position  of  the  base  triangle  that  corresponds  to  the 
second  solution. 

Since  the  tetrahedron  in  Figure  5(a)  is  threefold 
rotationally  symmetric,  two  more  solutions  can  be 
obtained  by  rotating  the  triangle  about  AB  and  AC. 


(0 
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Fig.  6.  An  Example  of  a  P4P  Problem  with  Two  Solutions. 


Figure  6(a)  specifies  a  P4P  problem  and  demon¬ 
strates  one  solution.  A  second  solution  can  be 
achieved  by  rotating  the  base  about  BC  so  that  A  is 
positioned  at  a  different  point  on  its  leg  (see  Figure 
6(b)).  To  verify  that  this  is  a  valid  solution  consider  the 
plane  X  =  O,  which  is  normal  to  BC  and  contains  the 
points,  L,  A,  and  D.  Figure  6(c)  shows  the  important 
features  in  this  plane.  The  cosine  of  alpha  is  119/ 
1 69.  A  rotation  of  beta  about  BC  repositions  A  at  A'. 
The  law  of  cosines  can  be  used  to  verify  the  position 
of  A'. 

To  complete  this  solution  it  is  necessary  to  verify 
that  the  rotated  position  of  D  is  on  LD.  Consider  the 
point  D'  in  Figure  6(c).  It  is  at  the  same  distance  from 
P  as  D  is  and  by  the  law  of  cosines  we  can  show  that 
gamma  equals  beta.  Therefore,  O',  which  is  on  LD,  is 
the  rotated  position  of  D.  The  points  A',  B,  C,  and  D' 
form  the  second  solution  to  the  problem. 


problem.  While  the  same  technique  will  work  for  six  or 
more  control  points,  four  or  more  of  these  points  must 
now  lie  in  the  same  plane  and  are  thus  no  longer  in 
general  position. 

To  prove  that  six  (or  more)  control  points  in  general 
position  will  always  produce  a  unique  solution  to  the 
P6P  problem,  we  note  that  for  this  case  we  can  always 


solve  for  the  12  coefficients  of  the  3x4  matrix  T  that 
specifies  the  mapping  (in  homogeneous  coordinates) 
from  3-space  to  2-space;  each  of  the  six  correspondences 
provides  three  new  equations  and  introduces  one  addi¬ 
tional  unknown  (the  homogeneous  coordinate  scale  fac¬ 
tor).  Thus,  for  six  control  points,  we  have  18  linear 
equations  to  solve  for  the  1 8  unknowns  (actually,  it  can 
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be  shown  that,  at  most,  17  of  the  unknowns  are  indepen¬ 
dent).  Given  the  transformation  matrix  T,  we  can  con¬ 
struct  an  additional  (synthetic)  control  point  lying  in  a 
common  plane  with  three  of  the  given  control  points  and 
compute  its  location  in  the  image  plane;  the  technique 
described  in  Appendix  B  can  now  be  used  to  find  a 
unique  solution. 

IV.  Implementation  Details  and  Experimental  Results 

A.  The  ransac/LD  Algorithm 

The  ransac/LD  algorithm  accepts  as  input  the  fol¬ 
lowing  data: 

(1)  A  list  L  of  m  6-tuples — each  6-tuple  containing  the  3-D  spatial 
coordinates  of  a  control  point,  its  corresponding  2-D  image  plane 
coordinates,  and  an  optional  number  giving  the  expected  error 
(in  pixels)  of  the  given  location  in  the  image  plane. 

(2)  The  focal  length  of  the  imaging  system  and  the  image  plane 
coordinates  of  the  principal  point. 

(3)  The  probability  (1  —  tv)  that  a  6-tuple  contains  a  gross  mismatch. 

(4)  A  “confidence”  number  G  which  is  used  to  set  the  internal 
thresholds  for  acceptance  of  intermediate  results  contributing  to 
a  solution.  A  confidence  number  of  one  forces  very  conservative 
behavior  on  the  algorithm;  a  confidence  number  of  zero  will  call 
almost  anything  a  valid  solution. 

Fig.  7.  An  Example  of  a  P5P  Problem  with  Two  Solutions. 


L 


This  example  is  the  same  as  the  P4P  example 
described  in  Figure  6  except  that  a  fifth  control  point, 
E,  has  been  added.  The  initial  position  for  E  and  its 
rotated  position,  £',  are  shown  in  Figure  7.  The  points 
E  and  E'  were  constructed  to  be  the  mirror  images  of 
A'  and  A  about  the  line  LP;  therefore,  a  rotation  of 
alpha  about  P  repositions  E  at  E'.  One  solution  of  the 
P5P  problem  is  formed  by  points  A,  B ,  C,  and  D 
(shown  in  Figure  6(a))  plus  point  E.  The  second  solu¬ 
tion  is  formed  by  points  A',  B,  C,  D',  and  E'.  Conse¬ 
quently  there  are  two  different  positions  of  L  such  that 
all  five  points  lie  on  their  appropriate  legs. 
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The  ransac/LD  algorithm  produces  as  output  the 
following  information: 

(1)  The  3-D  spatial  coordinates  of  the  lens  center  (i.e.,  the  Center  of 
Perspective),  and  an  estimate  of  the  corresponding  error. 

(2)  The  spatial  orientation  of  the  image  plane. 

The  ransac/LD  algorithm  operates  as  follows: 

(1)  Three  6-tuples  are  selected  from  list  L  by  a  quasirandom  method 
that  ensures  a  reasonable  spatial  distribution  for  the  correspond¬ 
ing  control  points.  This  initial  selection  is  called  SI. 

(2)  The  CP  (called  CPI)  corresponding  to  selection  SI  is  determined 
using  the  closed-form  solution  provided  in  Appendix  A;  multiple 
solutions  are  treated  as  if  they  were  obtained  from  separate 
selections  in  the  following  steps. 

(3)  The  error  in  the  derived  location  of  CP  1  is  estimated  by  perturbing 
the  given  image  plane  coordinates  of  the  three  selected  control 
points  (either  by  the  amount  specified  in  the  6-tuples  or  by  a 
default  value  of  one  pixel),  and  recomputing  the  effect  this  would 
have  on  the  location  of  the  CPI. 

(4)  Given  the  error  estimate  for  the  CPI,  we  use  the  technique 
described  in  [1]  to  determine  error  ellipses  (dimensions  based 
upon  the  supplied  confidence  number)  in  the  image  plane  for 
each  of  the  control  points  specified  in  list  L;  if  the  associated 
image  coordinates  reside  within  the  corresponding  error  ellipse, 
then  the  6-tuple  is  appended  to  the  consensus  set  SI  /CPI. 

(5)  If  the  size  of  S1/CP1  equals  or  exceeds  some  threshold  value  t 
(nominally  equal  to  a  value  between  7  and  mw),  then  the  consen¬ 
sus  set  SI /CPI  is  supplied  to  a  least-squares  routine  (see  [1]  or 
[7])  for  final  determination  of  the  CP  location  and  image  plane 
orientation.1  Otherwise,  the  above  steps  are  repeated  with  a  new 
random  selection  S2,  S3,  . . . 

(6)  If  the  number  of  iterations  of  the  above  steps  exceeds  k  =  [log 
(1  —  G)]/[log(  1  -  tv3)],  then  the  largest  consensus  set  found  so 
far  is  used  to  compute  the  final  solution  (or  we  terminate  in  failure 
if  this  largest  consensus  set  contains  fewer  than  six  members). 

B.  Experimental  Results 

To  demonstrate  the  validity  of  our  theoretical  results, 
we  performed  three  experiments.  In  the  first  experiment 
we  found  a  specific  LDP  in  which  the  common  least- 
squares  pruning  heuristic  failed,  and  showed  that  ransac 
successfully  solved  this  problem.  In  the  second  experi¬ 
ment,  we  applied  ransac  to  50  synthetic  problems  in 
order  to  check  the  reliability  of  the  approach  over  a  wide 
range  of  parameter  values.  In  the  third  experiment  we 
used  standard  feature  detection  techniques  to  locate 
landmarks  in  an  aerial  image  and  then  used  ransac  to 
determine  the  position  and  orientation  of  the  camera. 

C.  A  Location  Determination  Problem  Example  of  a 
Least  Squares  Pruning  Error 

The  LDP  in  this  experiment  was  based  upon  20 
landmarks  and  their  locations  in  an  image.  Five  of  the 
20  correspondences  were  gross  errors;  that  is,  their  given 
locations  in  the  image  were  further  than  10  pixels  from 
their  actual  locations.  The  image  locations  for  the  good 


1  An  alternative  to  least  squares  would  be  to  average  the  param¬ 
eters  computed  from  random  triples  in  the  consensus  set  that  fall 
within  (say)  the  center  50  percent  of  the  associated  histogram. 
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correspondences  were  normally  distributed  about  their 
actual  locations  with  a  standard  deviation  of  one  pixel. 

The  heuristic  to  prune  gross  errors  was  the  following: 

*  Use  all  of  the  correspondences  to  instantiate  a  model. 

*  On  the  basis  of  that  model,  delete  the  correspondence 
that  has  the  largest  deviation  from  its  predicted  image 
location. 

*  Instantiate  a  new  model  without  that  correspondence. 

*  If  the  new  model  implies  a  normalized  error  for  the 
deleted  correspondence  that  is  larger  than  three  stan¬ 
dard  deviations,  assume  that  it  is  a  gross  error,  leave  it 
out,  and  continue  deleting  correspondences.  Otherwise, 
assume  that  it  is  a  good  correspondence  and  return  the 
model  that  included  it  as  the  solution  to  the  problem. 

This  heuristic  successfully  deleted  two  of  the  gross 
errors;  but  after  deleting  a  third,  it  decided  that  the  new 
model  did  not  imply  a  significantly  large  error,  so  it 
returned  a  solution  based  upon  18  correspondences,  three 
of  which  were  gross  errors.  When  ransac  was  applied 
to  this  problem,  it  located  the  correct  solution  on  the 
second  triple  of  selected  points.  The  final  consensus  set 
contained  all  of  the  good  correspondences  and  none  of 
the  gross  errors. 

D.  50  Synthetic  Location  Determination  Problems 

In  this  experiment  ransac  was  applied  to  50  syn¬ 
thetic  LDPs.  Each  problem  was  based  upon  30  land- 
mark-to-image  correspondences.  A  range  of  probabilities 
were  used  to  determine  the  number  of  gross  errors  in  the 
problems;  the  image  location  of  a  gross  error  was  at  least 
10  pixels  from  its  actual  location.  The  location  of  a  good 
correspondence  was  distributed  about  its  actual  location 
with  a  normal  distribution  having  a  standard  deviation 
of  one  pixel.  Two  different  camera  positions  were  used — 
one  looking  straight  down  on  the  landmarks  and  one 
looking  at  them  from  an  oblique  angle.  The  ransac 
algorithm  described  earlier  in  this  section  was  applied  to 
these  problems;  however,  the  simple  iterative  technique 
described  in  Appendix  A  was  used  to  locate  solutions  to 
the  P3P  problems  in  place  of  the  closed  form  method 
also  described  in  that  appendix,  and  a  second  least- 
squares  fit  was  used  to  extend  the  final  consensus  set  (as 
suggested  in  Sec.  II  of  this  paper).  Table  I  summarizes 
the  results  for  ten  typical  problems  (ransac  successfully 
avoided  including  a  gross  error  in  its  final  consensus  set 
in  all  of  the  problems);  in  five  of  these  problems  the 
probability  of  a  good  correspondence  was  0.8,  and  in  the 
other  five  problems,  it  was  0.6.  The  execution  time  for 
the  current  program  is  approximately  1  sec  for  each 
camera  position  considered. 

E.  A  “Real”  Location  Determination  Problem 

Cross  correlation  was  used  to  locate  25  landmarks  in 
an  aerial  image  taken  from  approximately  4,000  ft  with 
a  6  in.  lens.  The  image  was  digitized  on  a  grid  of  2,000 
X  2,000  pixels,  which  implies  a  ground  resolution  of 
approximately  2  ft  per  pixel.  Three  gross  errors  were 
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Table  I.  Typical  Experimental  Results  Using  ransac. 


No.  of  Good 
Correspondences 

No.  of 

Correspondences 
in  Final 
Consensus  Set 

No.  of 
Triples 
Considered 

No.  of 
Camera 
Positions 
Considered 

22 

w  =  0.8 

19 

6 

10 

23 

23 

1 

3 

19 

19 

2 

3 

25 

25 

1 

2 

24 

23 

3 

8 

21 

w  -  0.6 
20 

11 

21 

17 

17 

1 

1 

17 

16 

6 

8 

18 

16 

9 

21 

21 

18 

9 

15 

made  by  the  correlation  feature  detector.  When  ransac 
was  applied  to  this  problem,  it  located  a  consensus  set  of 
17  on  the  first  triple  selected  and  then  extended  that  set 
to  include  all  22  good  correspondences  after  the  initial 
least-squares  fit.  The  final  standard  deviations  about  the 
camera  parameters  were  as  follows: 

A1:  0.1  ft  Heading:  0.01° 

7:6.4  ft  Pitch:  0.10° 

Z:  2.1  ft  Roll:  0.12° 


V.  Concluding  Comments 

In  this  paper  we  have  introduced  a  new  paradigm, 
Random  Sample  Consensus  (ransac),  for  fitting  a  model 
to  experimental  data,  ransac  is  capable  of  interpreting/ 
smoothing  data  containing  a  significant  percentage  of 
gross  errors,  and  thus  is  ideally  suited  for  applications  in 
automated  image  analysis  where  interpretation  is  based 
on  the  data  provided  by  error-prone  feature  detectors. 

A  major  portion  of  this  paper  describes  the  applica¬ 
tion  of  ransac  to  the  Location  Determination  Problem 
(LDP):  Given  an  image  depicting  a  set  of  landmarks 
with  known  locations,  determine  that  point  in  space  from 
which  the  image  was  obtained.  Most  of  the  results  we 
presented  concerning  solution  techniques  and  the  ge¬ 
ometry  of  the  LDP  problem  are  either  new  or  not 
generally  known.  The  current  photogrammetric  litera¬ 
ture  offers  no  analytic  solution  other  than  variants  of 
least  squares  and  the  Church  method  for  solving  per- 
spective-n-point  problems.  The  Church  method,  which 
provides  an  iterative  solution  for  the  P3P  problem 
[3,  11],  is  presented  without  any  indication  that  more 
than  one  physically  real  solution  is  possible;  there  is 
certainly  no  indication  that  anyone  realizes  that  physi¬ 
cally  real  multiple  solutions  are  possible  for  more  than 
three  control  points  in  general  position.  (It  should  be 
noted  that  because  the  multiple  solutions  can  be  arbi¬ 
trarily  close  together,  even  when  an  iterative  technique 
is  initialized  to  a  value  close  to  the  correct  solution  there 
is  no  assurance  that  it  will  converge  to  the  desired  value.) 
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In  the  section  on  the  LDP  problem  (and  associated 
appendices)  we  have  completely  characterized  the  P3P 
problem  and  provided  a  closed-form  solution.  We  have 
shown  that  multiple  physically  real  solutions  can  exist 
for  the  P4P  and  P5P  problems,  but  also  demonstrated 
that  a  unique  solution  is  assured  when  four  of  the  control 
points  reside  on  a  common  plane  (solution  techniques 
are  provided  for  each  of  these  cases).  The  issue  of 
determining  the  maximum  number  of  solutions  possible 
for  the  P4P  and  P5P  problems  remains  open,  but  we 
have  shown  that  a  unique  solution  exists  for  the  P6P 
problem  when  the  control  points  are  in  general  position. 


Appendix  A.  An  Analytic  Solution  for  the  Perspective- 
3-Point  Problem 

The  main  body  of  this  paper  established  that  P3P 
problems  can  have  as  many  as  four  solutions.  In  this 
appendix  a  closed  form  expression  for  obtaining  these 
solutions  is  derived.  Our  approach  involves  three  steps: 
(1)  Find  the  lengths  of  the  legs  of  the  (“perspective”) 
tetrahedron  given  the  base  (defined  by  the  three  control 
points)  and  the  face  angles  of  the  opposing  trihedral 
angle  (the  three  angles  to  the  three  pairs  of  control  points 
as  viewed  from  the  CP);  (2)  Locate  the  CP  with  respect 
to  the  3-D  reference  frame  in  which,  the  control  points 
were  originally  specified;  (3)  Compute  the  orientation  of 
the  image  plane  with  respect  to  the  reference  frame. 

1.  A  Solution  for  the  Perspective  Tetrahedron  (see  Figure 
4) 

Given  the  lengths  of  the  three  sides  of  the  base  of  a 
tetrahedron  (Rab,  Rac,  Rbc),  and  given  the  correspond¬ 
ing  face  angles  of  the  opposing  trihedral  angle  {dab,  dac, 
dbc),  find  the  lengths  of  the  three  remaining  sides  of  the 
tetrahedron  (a,  b,  c). 

A  solution  to  the  above  problem  can  be  obtained  by 
simultaneously  solving  the  system  of  equations: 


(Rabf  =  a2  +  b2  -  2  *a*b*  cos  {dab),  (Al) 

{Racf  —  a2  +  c2  -  2  *a*c*  cos  {dac),  (A2) 

{Rbcf  =  b2  +  c2  -  2  *b*c*  cos  {dbc).  (A3) 

We  now  proceed  as  follows: 

Let  b  =  x*a  and  c  =  y*a,  (A4) 

{Racf  =  a2  +  (/)*(a2)  -  2*(a2)*y*  cos  {dac),  (A5) 

{Rabf  =a2  +  (x2)*(a2)  -  2*{a2)*x*  cos  {dab),  (A6) 

{Rbcf  =  {x2)*{a2)  +  (f)*(a2)  (A7) 

—  2  *(a2)*x*y*  cos  {dbc). 


From  Eqs.  (A5)  and  (A7) 

[(7toc)2]*[l  +  (/)  -  2*y*  cos  {dac)] 

=  [(/?ac)2]*[(x2)  +  (/)  -  2  *x*y*  cos  (dbc)].  (  } 

From  Eqs.  (A6)  and  (A7) 

[(A6c)2]*[l  +  (x2)  -  2*x*  cos  (dab)] 

=  [(fla&)2]*[(x2)  +  (/)  -  2  *x*y*  cos  (dbc)],  (  ’ 


(Rbc) 

(Racf 


and 


(Rbcf 

(Rabf 


=  K2. 


(A  10) 


From  Eqs.  (A8)  and  (A9) 

0  =  (/)*[1  -  Al]  +  2*y*[Al*  cos  (dac) 

—  x*  cos(0hc)]  +  [(jc2)  —  Al].  ■  ' 

From  Eqs.  (A9)  and  (A  10) 

0  =  (/)  +  2  *y*[-x*  cos  (dbc)] 

+  [(x2)*(l  -  A2)  +  2*x*K2*  cos  (dab)  -  A2],  ^  ’ 


Equations  (All)  and  (A  12)  have  the  form: 

0  =  m*(y 2)  +  p*y  +  q,  (A13) 

0  =  m'*(y 2)  +  p'*y  +  q'.  (A14) 

Multiplying  Eqs.  (A  13)  and  (A  14)  by  m!  and  m,  respec¬ 
tively,  and  subtracting, 

0  =  [p*m’  —  p’*m]*y  +  [m’*q  —  m*q '].  (A15) 

Multiplying  Eqs.  (A  13)  and  (A  14)  by  q'  and  q,  respec¬ 
tively,  substracting,  and  dividing  by  y, 

0  =  [m'*q  -  m*q']*(y 2)  +  [ p'*q  - p*q']*y, 

0  =  \m'*q  —  m*q']*y  +  \p'*q  —  p*q'].  (A16) 

Assuming  m'*q  m*q', 

[(x2)  -  Al]  7*  [(x2)*(l  -  Al)*(l  -  K2)  + 

2*x*A2*(l  -  Al)*  cos  (dab)  -  (1  -  A1)*A2], 

then  Eqs.  (A  15)  and  (A  16)  are  equivalent  to  Eqs.  (A  13) 
and  (A14).  We  now  multiply  Eqs.  (A15)  by  (m'*q  — 
m*q'),  and  Eq.  (A16)  by  ( p*m '  —  p*m),  and  subtract  to 
obtain 

0  =  (m'*q  -  m*q'f 

-  [p*m' —  p'*m]*[p'*q  —  p*q’].  (A17) 

Expanding  Eq.  (A  17)  and  grouping  terms  we  obtain  a 
biquadratic  (quartic)  polynomial  in  x : 


0  =  G4*(x4)  +  G3*(x3)  +  G2*(x2)  +  Gl*(x)  . . 

+  GO,  (Al8) 

where 

G4  =  (Al  * A2  -  Al  -  K2f  q 

-  4*Kl*K2*[cos(dbcf],  {  } 

G3  =  4*[A1*A2  -  Al  -  A2]*A2*(1  -  Al)* 
cos  (dab)  +  4*A1*  cos(dbc)*[(Kl*K2 
+  K2  —  Al)*  cos  (dac) 

+  2*A2*  cos  (dab)*  cos  (dbc)],  (A20) 

G2  =  [2*A2*(1  -  Al)  cos(6'ah)]2 

+  2*[A1*A2  +  Al  -  A2]*[A1*A2  -  Al 

-  A2]  +  4*A1*[(A1  -  A2)*  (cos (dbcf) 

+  (1  -  A2)*A1*  (cos (dacf) 

—  2*A2*(1  +  Al)*  cos  (dab)*  cos  (dac)* 

cos(06c)],  (A21) 

G1  =  4*(A1*A2  +  Al  -  A2)*A2*(1  -  Al)* 
cos  (dab)  +  4*A1*[(A1*A2  -  Al 
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+  K2)*  cos (9ac)*  cos (Obc) 

+  2*K\*K2*  cos  (dab)*  (cos  (Oacf)], 

GO  =(Kl*K2  +  Kl  -  K2f  -  4*(Kl‘J)*K2* 
(cos  (Oacf). 


(A22) 

(A23) 


Roots  of  Eq.  (A  18)  can  be  found  in  closed  form  [5],  or 
by  iterative  techniques  [4].  For  each  positive  real  root  of 
Eq.  (A  18),  we  determine  a  single  positive  real  value  for 
each  of  the  sides  a  and  b.  From  Eq.  (A6)  we  have 


Rab 

SQRT[(x2)  -  2*x*  cos  {dab)  +  1]  ’ 


(A24) 


and  from  Eq.  (A4)  we  obtain 

b  =  a*x.  (A25) 


(A26) 


If  m'*q  7^  m*q',  then  from  Eq.  (A16)  we  have 

p'*q-p*q' 

y  m*q'  —  m'*q  ' 

If  m'*q  —  m*q',  then  Eq.  (A26)  is  undefined  and  we 
obtain  two  values  of  y  from  Eq.  (A5): 

y  =  cos  (8ac) 

±  SQRt\ (cos (6ae))2  + 


(Racf  -  (a2) 


(a2) 


(A27) 


For  each  real  positive  value  of  y,  we  obtain  a  value  of  c 
from  Eq.  (A4): 

c  =  y*a  (A28) 

When  values  of y  are  obtained  from  Eq.  (A5)  rather  than 
Eq.  (A26),  the  resulting  solutions  can  be  invalid;  they 
must  be  shown  to  satisfy  Eq.  (A3)  before  they  are 
accepted. 

It  should  be  noted  that  because  each  root  of  Eq. 
(A  18)  can  conceivably  lead  to  two  distinct  solutions,  the 
existence  of  the  biquadratic  does  not  by  itself  imply  a 
maximum  of  four  solutions  to  the  P3P  problem;  some 
additional  argument,  such  as  the  one  given  in  the  main 
body  of  this  paper,  is  necessary  to  establish  the  upper 
bound  of  four  solutions. 


2.  Example 

For  the  perspective  tetrahedron  shown  in  Figure  5, 
we  have  the  following  parameters: 

Rab  =  Rac  =  Rbc  =  2*SQRT(3), 

cos  (dab)  =  cos  (6ac)  =  cos  (8bc) 

(a2)  +  ( b 2)  -  (Rabf  _  20 

2 *a*b  32 ' 

Substituting  these  values  into  Eqs.  (A  19)  through  (A23), 
we  obtain  the  coefficients  of  the  biquadratic  defined  in 
Eq.  (A  18): 

[-0.5625,  3.515625,  -5.90625,  3.515625,  -0.5625] 

The  roots  of  the  above  equation  are 
[1,  1,4,  0.25] 


For  each  root 


Root  a  b  y  c 

1  4  4  1  4 

1  4  4  0.25  1 

4  14  4  4 

0.25  4  1  1  4 


3.  An  Iterative  Solution  for  the  Perspective  Tetrahedron 
(see  Figure  8) 

A  simple  way  to  locate  solutions  to  P3P  problems, 
which  is  sometimes  an  adequate  substitute  for  the  more 
involved  procedure  described  in  the  preceding  subsec¬ 
tion,  is  to  slide  one  vertex  of  the  control-point  triangle 
down  its  leg  of  the  tetrahedron  and  look  for  positions  of 
the  triangle  in  which  the  other  two  vertices  lie  on  their 
respective  legs.  If  vertex  A  is  at  a  distance  a  from  L  ( L 
is  the  center  of  perspective),  the  lengths  of  the  sides  Rab 
and  Rac  restrict  the  triangle  to  four  possible  positions. 
Given  the  angle  between  legs  LA  and  LB,  compute  the 
distance  of  point  A  from  the  line  LB  and  then  compute 
points  B 1  and  B2  on  LB  that  are  the  proper  distance 
from  A  to  insert  a  line  segment  of  length  Rab.  Similarly, 
we  compute  at  most  two  locations  for  C  on  its  leg.  Thus, 
given  a  position  for  A  we  have  found  at  most  four 
positions  for  a  triangle  that  has  one  side  of  length  Rab 
and  one  of  length  Rac.  The  lengths  of  the  third  sides 
( BC )  of  the  four  triangles  vary  nonlinearly  as  point  A  is 
moved  down  its  leg.  Solutions  to  the  problem  can  be 
obtained  by  iteratively  repositioning  A  to  imply  a  third 
side  of  the  required  length. 

4.  Computing  the  3 -D  Location  of  the  Center  of  Per¬ 
spective  (see  Figure  9) 

Given  the  three-dimensional  locations  of  the  three 
control  points  of  a  perspective  tetrahedron,  and  the 
lengths  of  the  three  legs,  the  3-D  location  of  the  center 
of  perspective  can  be  computed  as  follows: 

( 1 )  Construct  a  plane  P 1  that  is  normal  to  AB  and  passes  through  the 
center  of  perspective,  L.  This  plane  can  be  constructed  without 
knowing  the  position  of  L,  which  is  what  we  are  trying  to  compute. 
Consider  the  face  of  the  tetrahedron  that  contains  vertices  A,  B, 
and  L.  Knowing  the  lengths  of  sides  LA,  LB  and  AB,  we  can  use 
the  law  of  cosines  to  find  the  angle  LAB,  and  then  the  projection 
QA  of  LA  on  AB.  (Note  that  angle  LQA  is  a  right  angle,  and  the 
point  Q  is  that  point  on  line  AB  that  is  closest  to  L).  Construct  a 
plane  normal  to  AB  passing  through  Q:  this  plane  also  passes 
through  L. 

(2)  Similarly  construct  a  plane  P2  that  is  normal  to  AC  and  passes 
through  L. 

(3)  Construct  the  plane  Pi  defined  by  the  three  points  A,  B,  and  C. 

(4)  Intersect  planes  PI,  P2,  and  Pi.  By  construction,  the  point  of 
intersection  R  is  the  point  on  Pi  that  is  closest  to  L. 

(5)  Compute  the  length  of  the  line  AR  and  use  that  in  conjunction 
with  the  length  of  LA  to  compute  the  length  of  the  line  RL,  which 
is  the  distance  of  L  from  the  plane  Pi. 

(6)  Compute  the  cross  product  of  vectors  AB  and  A  C  to  form  a  vector 
perpendicular  to  Pi.  Then  scale  that  vector  by  the  length  of  RL 
and  add  it  to  R  to  get  the  3 -D  location  of  the  center  of  perspective 
L. 
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Fig.  8.  Geometry  for  an  Iterative  Solution  to  the  P3P  Problem. 

L 


(a) 


If  the  focal  length  of  the  camera  and  the  principal 
point  in  the  image  plane  are  known,  it  is  possible  to 
compute  the  orientation  of  the  image  plane  with  respect 
to  the  world  coordinate  system;  that  is,  the  location  of 
the  origin  and  the  orientation  of  the  image  plane  coor¬ 
dinate  system  with  respect  to  the  3-D  reference  frame. 
This  can  be  done  as  follows: 


Fig.  9  Computing  the  3 -D  Location  of  the  Center  of  the  Perspective  ( L). 
L 


(1)  Compute  the  3 -D  reference  frame  coordinates  of  the  center  of 
perspective  (as  described  above). 

(2)  Compute  the  3 -D  coordinates  of  the  image  locations  of  the  three 
control  points:  since  we  know  the  3-D  coordinates  of  the  CP  and 
the  control  points,  we  can  compute  the  3-D  coordinates  of  the 
three  rays  between  the  CP  and  the  control  points.  Knowing  the 
focal  length  of  the  imaging  system,  we  can  compute,  and  subtract 
from  each  ray,  the  distance  from  the  CP  to  the  image  plane  along 
the  ray. 

(3)  Compute  the  equation  of  the  plane  containing  the  image  using 
the  three  points  found  in  step  (2).  The  normal  to  this  plane, 
passing  through  the  CP,  gives  us  the  origin  of  the  image  plane 
coordinate  system  (i.e.,  the  3-D  location  of  the  principal  point), 
and  the  Z  axis  of  this  system. 

(4)  The  orientation  of  the  image  plane  about  the  Z  axis  can  be 
obtained  by  computing  the  3-D  coordinates  of  a  vector  from  the 
principal  point  to  any  one  of  the  points  found  in  step  (2). 


Appendix  B.  An  Analytic  Solution  for  the  Perspective- 
4-Point  Problem  (with  all  control  points  lying  in  a 
common  plane) 

In  this  appendix  an  analytic  technique  is  presented 
for  obtaining  a  unique  solution  to  the  P4P  problem  when 
the  four  given  control  points  all  lie  in  a  common  plane. 

1.  Problem  Statement  (see  Figure  10) 

GIVEN:  A  correspondence  between  four  points  lying 
in  a  plane  in  3-D  space  (called  the  object  plane),  and 
four  points  lying  in  a  distinct  plane  (called  the  image 
plane);  and  given  the  distance  between  the  center  of 
perspective  and  the  image  plane  (i.e.,  the  focal  length  of 
the  imaging  system);  and  also  given  the  principal  point 
in  the  image  plane  (i.e.,  the  location,  in  image  plane 
coordinates,  of  the  point  at  which  the  optical  axis  of  the 
lense  pierces  the  image  plane). 

FIND:  the  3-D  location  of  the  center  of  perspective 
relative  to  the  coordinate  system  of  the  object  plane. 
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Fig.  10.  Geometry  of  the  P4P  Problem  (With  all  Control  Points  Lying 
in  a  Common  Plane). 


IMAGE  PLANE 


2.  Notation 

*  Let  the  four  given  image  points  be  labeled  {Pi},  and 
the  four  corresponding  object  points  {Qi}. 

*  We  will  assume  that  the  2-D  image  plane  coordinate 
system  has  its  origin  at  the  principal  point  ( PPI ). 

*  We  will  assume  that  the  object  plane  has  the  equation 
Z  =  0  in  the  reference  coordinate  system.  Standard 
techniques  are  available  to  transform  from  this  coor¬ 
dinate  system  into  a  ground  reference  frame  (e.g.,  see 
[6]  or  [9]). 

*  Homogeneous  coordinates  will  be  assumed  [12], 

*  Primed  symbols  represent  transposed  structures. 


3.  Solution  Procedure 

(a)  Compute  the  3x3  collineation  matrix  T  which 
maps  points  from  object  plane  to  image  plane  (a 
procedure  for  computing  T  is  given  later): 

[pi]  =  t mot i 


where 

[j Pi]  =  [ki*xi,  ki*yi,  ki]\ 
[Qi]  =  [Xi,  Yi,  1]'. 


(Bl) 


(b)  The  ideal  line  in  the  object  plane,  with  coordinates 
[0,  0,  1]'  is  mapped  into  the  vanishing  line  in  the 
image  plane  [  VLI  ]  by  the  transformation: 

[VLI]  =  [inv[7']]'*[0,  0,  1]'.  (B2) 


(c)  Determine  the  distance  DI  from  the  origin  of  the 
image  plane  (PPI)  to  the  vanishing  line  [  VLI]  = 


[nl,  a2,  tz3]': 


DI  = 


a3 

sqrt[(aYf  +  (a2)2] 


(B3) 


(d)  Solve  for  the  dihedral  (tilt)  angle  9  between  the 
image  and  object  planes: 

9  =  arctan(  f/ DI),  (B4) 

where  /  =  focal  length. 

(e)  The  ideal  line  in  the  image  plane  with  coordinates 
[0,  0,  1]'  is  mapped  into  the  vanishing  line  in  the 
object  plane  [  VLO]  by  the  transform 

[VLO]  =  [r]'*[0,  0,  1]'.  (B5) 

(f)  Compute  the  location  of  point  [PPO]  in  the  object 
plane  ([PPO]  is  the  point  at  which  the  optical  axis 
of  the  lense  pierces  the  object  plane): 

[PPO]  =  [inv[r]]'*[0,  0,  1]'  (B6) 

(g)  Compute  the  distance  DO  from  [PPO]  =  [cl,  c2, 
c3]'  to  the  vanishing  line  [VLO]  =  [b\,  62,  63]'  in 
the  object  plane: 

61*cl  +  62*c2  +63*c3 

D°  “  c3*sqrt[(blf  +  (62)2]  '  (  ) 

(h)  Solve  for  the  “pan”  angle  $  as  the  angle  between 
the  normal  to  [  VLO]  =  [61,  62,  63]'  and  the  X  axis 
in  the  object  plane: 

$  =  arctan(— 62/61).  (B8) 


(i)  Determine  XSGN  and  YSGN:  If  a  line  (parallel  to 
the  X  axis  in  the  object  plane)  through  [PPO] 
intersects  [VLO]  to  the  right  of  [PPO],  then 
XSGN  =  1.  Otherwise  XSGN  =  -1.  Thus, 


61*cl  +  62*c2  +  63*c3  „ 

if -  <  0, 

61*c3 

then  XSGN  =  1 ,  otherwise  XSGN  =  - 1 .  (B9) 

Similarly, 

61*cl  +  62*c2  +  63*c3  „ 

lf - ws - <0 

then  YSGN  =  1 ,  otherwise  YSGN  =  - 1 .  (B10) 


( j )  Solve  for  the  location  of  the  CP  in  the  object  plane 
coordinate  system: 


DCP  =  DO*  sin(0)  (Bl  1) 

XCP  =  XSGN* abs[DCP*  sin(0)*  cos($)] 

+  cl/c3  (B12) 

TCP  =  YSGN*abs[DCP*  sin(0)*  sin($)] 

+  c2/c3  (B13) 

ZCP  =  DC P*  cos (9)  (£14) 


Note:  If  [  VLI],  as  determined  in  (b),  has  the  coordinates 
[0,  0,  k ],  then  the  image  and  object  planes  are 
parallel  (0  =  0).  Rather  than  continuing  with  the 
above  procedure,  we  now  solve  for  the  desired 
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information  using  similar  triangles  and  Euclidean 
geometry. 

4.  Computing  the  Collineation  Matrix  T 
Let 


II  XI  Y 1  1 1| 

[■ Q ]  =11*2  ra  1 II  =  [[01]\  [02]',  [03]'], 
II  A3  13  1 II 

||xlyl  1|| 

[P]  =  ||x272  1||  =  [[Pl]',[7>2]',[7>3]'], 
11*3/3  1|| 

[04]=[3T4,  14,  1]', 

[P4]  =  [x4,/4,  1]', 

[F]  =  [inv[>]]'*[P4]  =  [vl,  v2,  v3]', 

[R]  =  [inv[0]]'*[04]  =  [rl,  r2,  r3]', 
vl  .  r3 

W  ~  rl  v3  ’ 

-  y2  *  r3 

II  ""I  o  0|| 

[w]  =  ||  0  w2  0||. 

II 0  0  1|| 

Then, 

[77  =  [inv[0]]*[  W]*[P  ] 
such  that 
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[ft]  =ki*[xi,yi,  l]  =  [T\*\Qi\. 


S.  Example 
Given; 

/  =  0.3048  m  (12  in.) 

PI  =  (—0.071263,  0.029665)  Ql  =  (  -30,  80) 

P2  =  (-0.053033,  -0.006379)  02  =  (-100,  -20) 

7*3  =  (—0.014063,  0.061579)  03  =  (  140,  50) 

P4=  (  0.080120,-0.030305)  04  =  (  -40, -240) 

|  0.000212  0.000236  0.000925 1| 
|-0.000368  0.000137  0.000534 1| 
1-0.025404  0.021650  0.843879 1| 

|  1117.14  -2038.86  0.0  || 

j  3371.56  2302.22  —5.14991  jj 

1-51.0636  -120.442  1.31713 1| 

(b)  [FL/]  =  [0, -5.14991,  1.31713]' 

(c)  7)7  =  0.255758 

(d)  6  —  0.872665  rad  (50°) 

(e)  [  VLO ]  =  [0.000925,  0.000534,  0.843880]' 

(f)  \_PPO]  =  [-51.0636,  -120.442,  1.31713]' 

(g)  DO  =  711.196 

(h)  $  =  -0.523599  rad  (-30°) 

XSGN=- 1 

W  LSGA^-l 

(j)  DCP=  544.8081 

XCP  =  -400.202 
YCP  =  -300.117 
ZCP  =  350.196 
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